Here we report the whole-genome shotgun sequence of a Peruvian strain of Arthrospira platensis (Paraca), a cultivated and
edible haloalkaliphilic cyanobacterium of great scientific, technical, and economic potential.
Arthrospira platensis, a filamentous cyanobacterium, forms
massive populations in tropical water bodies characterized by
high levels of carbonate and bicarbonate and pH up to 11 (1). It is
an edible microorganism whose consumption remains traditional 
in Chad (2). Well-documented nutritional and therapeutic
properties of A. platensis, mainly as a tool against malnutrition (3),
have created a growing market of several thousand tons per year. As
an extremophilic photosynthetic microorganism, A. platensis is
suitable for mass production (4), including the mass
production of highly valued biopolymers (5). The high pH (up to
11.5) of its fully mineral culture medium provides an efficient 
barrier against contamination by other bacteria (6). Moreover, it is a
strong candidate for life-support systems in future space or
submarine missions (7), and is also considered to be, along with
other photosynthetic aquatic microorganisms, a potential biofuel 
source (8). 

Arthrospira platensis strain Paraca was provided by Biorigin SA
(Switzerland). We established axenic cultures grown in a modified
Zarrouk medium at 25°C (6). Cultures were checked for bacterial
contamination prior to DNA extraction, which was performed on a
100-mg (FW) filtered sample according to a modified DNA
extraction micro-method (9). Whole-genome shotgun sequencing
of the Paraca strain was then carried out on an Illumina HiSeq
2000, producing 22,924,753 paired-end reads of 100 bp. Assembly
with SPAdes v3.0 (10) yielded 268 contigs for a
genome length of 6,501,886 bp (contig N50 of 72,660 bp) with an
average G+C content of 44.31%. Annotation through the Prokaryotic
Genomes Automatic Annotation Pipeline Group
(PGAAPG) predicted 5,824 genes, including 5,439 protein-coding
sequences (CDSs). It also identified 6 clustered regularly interspaced
short palindromic repeat (CRISPR) arrays, 388 pseudogenes,
and 47 RNA genes, while RAST analysis (11) identified
7,605 CDSs and 46 RNA genes (40 tRNA and 6 rRNA genes). A
comparison of this sequence to previously registered sequences of
Arthrospira genomes has been published (12). As observed in Arthrospira
sp. PCC 8005 (13), numerous genes belonging to the same
metabolic pathways are dispersed over the genome. Data analysis
identified genes coding for hydrogenases involved in H2 production,
such as the Hox and Hyp loci, and 1 complete and 2 partial nitrogenase
sequences. Regarding nitrogen metabolism, this strain
is equipped for nitrate and nitrite uptake, ammonium synthesis,
ammonium assimilation, and nitrogen assimilation through a
typical Mo-dependent nitrogenase.
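
The assembly statistics reported above (number of contigs, total length, contig N50, and G+C content) can be recomputed from any contig FASTA file. The following is a minimal sketch and not the procedure used for this announcement: the function names are hypothetical, N50 is taken as the length of the shortest contig in the smallest set of longest contigs covering at least half of the assembly, and G+C content is computed over unambiguous A, C, G, and T bases only.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {contig_name: sequence} mapping."""
    chunks: Dict[str, List[str]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the contig name.
            parts = line[1:].split()
            name = parts[0] if parts else f"contig_{len(chunks) + 1}"
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {key: "".join(value) for key, value in chunks.items()}


def n50(lengths: Iterable[int]) -> int:
    """Return the contig N50 for a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig that brings the running sum to >= 50%.
        if 2 * running >= total:
            return length
    return 0


def gc_percent(sequences: Iterable[str]) -> float:
    """Return G+C content (%) over unambiguous bases (A, C, G, T)."""
    gc = 0
    acgt = 0
    for seq in sequences:
        g_and_c = seq.count("G") + seq.count("C")
        gc += g_and_c
        acgt += g_and_c + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


# Toy input for illustration only; these are not contigs from strain Paraca.
toy = read_fasta([">c1 example", "GGCCGGCCAT", ">c2", "ATATAT", ">c3", "ACGT"])
toy_lengths = [len(seq) for seq in toy.values()]
toy_n50 = n50(toy_lengths)
toy_gc = gc_percent(toy.values())
```

Applied to the deposited contigs, `len(toy)` would correspond to the contig count and `sum(toy_lengths)` to the genome length; small differences from the published values could arise if ambiguous bases or scaffolds are handled differently.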

Numerous genes involved in carotene, thiamine, tocopherol,
biotin, cobalamin, and quinone metabolism are present. The
strain also possesses genes for resistance to metalloids and metals
such as arsenic, copper, cobalt, zinc, and cadmium, as well as a
mercuric ion reductase. No plasmid or complete phage
sequences were found, although a few interspersed prophage protein
and plasmid stabilization protein sequences were present.
Furthermore, no genes coding for known cyanobacterial toxin
biosynthesis pathways were detected. An in-depth study of the
genome of A. platensis should elucidate the mechanisms involved
in this bacterium's survival in hyperalkaline environments and
suggest possible industrial uses.

Nucleotide sequence accession numbers. This whole-genome 
shotgun project has been deposited at DDBJ/EMBL/GenBank under
the accession no. ACSK00000000. The version described in
this paper is version ACSK03000000 (GenBank Assembly ID 
GCA_000175415.3). 